LONG OLIGONUCLEOTIDE ARRAYS

INTRODUCTION

Technical Field

The field of this invention is nucleic acid arrays. 
Background of the Invention 

Nucleic acid arrays have become an increasingly important tool in the
biotechnology industry and related fields. Nucleic acid arrays, in which a plurality of
nucleic acids are deposited onto a solid support surface in the form of an array or pattern,
find use in a variety of applications, including drug screening, nucleic acid sequencing,
mutation analysis, and the like. One important use of nucleic acid arrays is in the analysis
of differential gene expression, where the expression of genes in different cells, normally
a cell of interest and a control, is compared and any discrepancies in expression are
identified. In such assays, the presence of discrepancies indicates a difference in the
classes of genes expressed in the cells being compared.

In methods of differential gene expression, arrays find use by serving as a substrate
to which nucleic acid "probe" fragments are bound. One then obtains "targets" from at least
two different cellular sources which are to be compared, e.g. analogous cells, tissues or
organs of a healthy and diseased organism. The targets are then hybridized to the
immobilized set of nucleic acid "probe" fragments. Differences between the resultant
hybridization patterns are then detected and related to differences in gene expression in
the two sources.

A number of different physical parameters of the array which is used in such
assays can have a significant effect on the results that are obtained from the assay. One
physical parameter of nucleic acid arrays that can exert a significant influence over the
nature of the results which are obtained from the array is probe size, i.e. the length of the
individual probe nucleic acids stably associated with the surface of the solid support in the
array. There are generally two different types of arrays currently finding use: (1) cDNA
arrays, in which either full length or partial cDNAs are employed as probes; and (2)
oligonucleotide arrays, in which probes of from about 8 to 25 nucleotides are employed.

In currently used cDNA arrays, the double stranded cDNAs, which may be
substantially full length or partial fragments thereof, are stably associated with the surface
of a solid support, e.g. nylon membrane. Advantages of cDNA arrays include high
sensitivity, a feature that stems from the high efficiency of binding of the cDNA probe to
its target and the stringent hybridization and washing conditions that may be employed
with such arrays. Disadvantages of cDNA arrays include difficulties in large scale
production of such arrays, low reproducibility of such arrays, and the like.

The other current alternative, oligonucleotide arrays, employs oligonucleotide
probes in which each probe ranges from about 8 to 25, usually 20 to 35 nucleotides in
length. While such arrays are more amenable to large scale production, they suffer from
disadvantages as well. One significant disadvantage for such arrays is their lower
sensitivity for target nucleic acids, as compared to cDNA arrays. Another disadvantage is
the wide variation in hybridization efficiency of different probes for the same target in a
given protocol, a feature that requires the use of multiple oligonucleotide probes for the
same target; this redundancy adds significantly to the cost of producing such arrays.

As such, there is a continued interest in the development of new array formats. Of
particular interest would be the development of an array format that combines the high
sensitivity of cDNA arrays with the high throughput manufacturability of oligonucleotide
arrays, where the format would not suffer from the disadvantages experienced with cDNA
and oligonucleotide arrays, as described above.
al., Nature Genet. (1999) 21:5-9; Sohail et al., RNA (1999) 5:646-655; Mir et al., Nature
Biotech. (1999) 17:788-792; Beier et al., Nucl. Acids Res. (1999) 27:1970-1977; Rogers,
606; Chen et al., Nucl. Acids Res. (1999) 27:389-395; Maldonado-Rodriguez et al.,
Molec. Biotech. (1999) 11:13-25; Lipshutz et al., Nature Genet. (1999) 21:20-24; Alon et
al., Proc. Natl. Acad. Sci. (1999) 96:6745-6750; Gunderson et al., Genome Research
(1998) 8:1142-1153; Gilles et al., Nature Biotech. (1999) 17:365-370; Duggan et al.,
Nature Genet. (1999) 21:10-14; Brown, P.O., Nature Genet. (1999) 21:33-37; Pollack et
al., Nature Genet. (1999) 23:41-46; Wang et al., Gene (1999) 229:101-108; Bowtell,
Nature Genet. (1999) 21:25-32; Schena et al., TIBS (1998) 16:301-306; Debouck et al.,
Nature Genet. (1999) 21:48-50; The Microarray Meeting: Technology, Application and
Analysis, Mountain Shadows Marriott Resort, Scottsdale, Arizona, September 22-25,
1999, Abstracts: 6-85; Gerhold et al., Trends in Biochem. Sciences (1999) 24:168-173;
Graves et al., Trends in Biotech. (1999) 17:127-134; Ekins et al., Trends in Biotech.
(1999) 17:217-218; Atlas Human cDNA Expression Array I (April 1997)
CLONTECHniques XII: 4-7; Lockhart et al., Nature Biotechnology (1996) 14:1675-1680;
Schena et al., Science (1995) 270:467-470; Schena et al., Proc. Nat'l Acad. Sci.
USA (1996) 93:10614-10619; and Chalifour et al., Anal. Biochem. (1994) 216:299-304.
SUMMARY OF THE INVENTION

Long oligonucleotide arrays, as well as methods for their preparation and use in
hybridization assays, are provided. The subject arrays are characterized in that at least a
portion of the probes of the array, and usually all of the probes of the array, are long
oligonucleotides, e.g. oligonucleotides having a length of from about 50 to 120 nt. Each
long oligonucleotide probe on the array is preferably chosen to exhibit high target binding
efficiency and low non-specific binding under conditions in which the array is employed,
e.g. stringent hybridization conditions. In many embodiments, the specific probe
oligonucleotides are chosen so that they have substantially the same hybridization
efficiency to their respective targets. The subject arrays find use in a number of different
applications, e.g. differential gene expression analysis.

BRIEF DESCRIPTION OF THE FIGURES

Fig. 1 provides a graphical representation of the hybridization efficiency of
different length oligonucleotides.

DEFINITIONS 

The term "nucleic acid" as used herein means a polymer composed of nucleotides,
e.g. naturally occurring deoxyribonucleotides or ribonucleotides, as well as synthetic
mimetics thereof which are also capable of participating in sequence specific, Watson-Crick
type hybridization reactions, such as is found in peptide nucleic acids, etc.

The terms "ribonucleic acid" and "RNA" as used herein mean a polymer
composed of ribonucleotides.

The terms "deoxyribonucleic acid" and "DNA" as used herein mean a polymer
composed of deoxyribonucleotides.

The term "short oligonucleotide" as used herein denotes single stranded nucleotide
multimers of from about 8 to 50 nucleotides in length, i.e. 8 to 50 mers.

The term "long oligonucleotide" as used herein denotes single stranded nucleotide
multimers of from about 50 to 150, usually from about 50 to 120, nucleotides in length,
e.g. a 50 to 150 mer, 50 to 120 mer, etc.

The term "polynucleotide" as used herein refers to a single or double stranded
polymer composed of nucleotide monomers of greater than about 150 nucleotides in
length up to about 5000 nucleotides in length.

The term "oligonucleotide probe composition" refers to the nucleic acid
composition that makes up each of the probe spots on the array that correspond to a
target nucleic acid. Thus, oligonucleotide probe compositions of the subject arrays are
nucleic acid compositions of a plurality of long oligonucleotides, where the composition
may be homogeneous or heterogeneous with respect to the long oligonucleotides that make
up the probe composition, i.e. each of the long oligonucleotides in the probe composition
may have the same sequence such that they are identical or each of the probe
compositions may be made up of two or more different long oligonucleotides that differ
from each other in terms of sequence.

The term "target nucleic acid" means a nucleic acid for which there is one or more
corresponding oligonucleotide probe compositions, i.e. probe oligonucleotide spots,
present on the array. The target nucleic acid may be represented by one or more different
oligonucleotide probe compositions on the array. The target nucleic acid is a nucleic acid
of interest in a sample being tested with the array, where by "of interest" is meant that the
presence or absence of target in the sample provides useful information, e.g. unique and
defining characteristics, about the genetic profile of the cell(s) from which the sample is
prepared. As such, target nucleic acids are not housekeeping genes or other types of genes
which are present in a number of diverse cell types and therefore the presence or absence
of which does not provide characterizing information about a particular cell's genetic
profile.

The terms "background" or "background signal intensity" refer to hybridization
signals resulting from non-specific binding of labeled target to the substrate component of
the array. Background signals may also be produced by intrinsic fluorescence of the array
components themselves. A single background signal can be calculated for the entire array,
or a different background signal may be calculated for each target nucleic acid.

B,F&FRef: CLON-015 
ClontechRef: P-103 

F.\DOCUMENT\CLON\015\patent application wpd -5- 



The term "non-specific hybridization" refers to the non-specific binding or
hybridization of a target nucleic acid to a nucleic acid present on the array surface, e.g. a
long oligonucleotide probe of a probe spot on the array surface, a nucleic acid of a control
spot on the array surface, and the like, where the target and the probe are not substantially
complementary.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

Long oligonucleotide arrays, as well as methods for their preparation and use in
hybridization assays, are provided. The subject arrays are characterized in that at least a
portion or fraction, usually a majority of or substantially all of the probes of the array, and
usually all of the probes of the array, are long oligonucleotides, e.g. oligonucleotides
having a length of from about 50 to 120 nt. Each long oligonucleotide probe on the array
is preferably chosen to exhibit high target binding efficiency and low non-specific
hybridization under conditions in which the array is employed, e.g. stringent conditions. In
certain embodiments, the arrays are further characterized in that each of the distinct probes
on the array has substantially the same hybridization efficiency for its respective target.
The subject arrays find particular use in gene expression assays. In further describing the
subject invention, the arrays will first be described in general terms. Next, methods for
their preparation are described. Following this description, a review of representative
applications in which the subject arrays may be employed is provided.

Before the subject invention is described further, it is to be understood that the
invention is not limited to the particular embodiments of the invention described below, as
variations of the particular embodiments may be made and still fall within the scope of the
appended claims. It is also to be understood that the terminology employed is for the
purpose of describing particular embodiments, and is not intended to be limiting. Instead,
the scope of the present invention will be established by the appended claims.

In this specification and the appended claims, the singular forms "a," "an," and
"the" include plural reference unless the context clearly dictates otherwise. Unless defined
otherwise, all technical and scientific terms used herein have the same meaning as
commonly understood by one of ordinary skill in the art to which this invention belongs.

Arrays of the Subject Invention - General Description

The arrays of the subject invention have a plurality of probe spots stably associated
with a surface of a solid support. A feature of the subject arrays is that at least a portion of
the probe spots, and preferably substantially all of the probe spots on the array are probe
oligonucleotide spots, where each probe oligonucleotide spot on the array comprises an
oligonucleotide probe composition made up of a plurality of long oligonucleotides of
known identity, usually of known sequence, as described in greater detail below.

Probe Spots of the Arrays

As mentioned above, a feature of the subject invention is the nature of the probe
spots, i.e. that at least a portion of, and usually substantially all of, the probe spots on the
array are made up of probe nucleic acid compositions of long oligonucleotides. Each
probe spot on the surface of the substrate is made up of long oligonucleotide probes,
where the spot may be homogeneous with respect to the nature of the long oligonucleotide
probes present therein or heterogeneous, e.g. as described in U.S. Patent Application Serial
No. 60/104,179, the disclosure of which is herein incorporated by reference. A feature of
the oligonucleotide probe compositions is that the probe compositions are made up of
long oligonucleotides. As such, the oligonucleotide probes of the probe compositions
range in length from about 50 to 150, typically from about 50 to 120 nt and more usually
from about 60 to 100 nt, where in many preferred embodiments the probes range in length
from about 65 to 85 nt.

In addition to the above length characteristics, the long oligonucleotide probes that
make up the probe spots are typically characterized by one or more of the
following features in many preferred embodiments of the subject invention. One further
characterization of the long oligonucleotide probes that make up the subject arrays is that
their sequence is chosen to provide for high binding efficiency to their complementary
target under stringent conditions. Binding efficiency refers to the ability of the probe to
bind to its target under the hybridization conditions in which the array is used. Put another
way, binding efficiency refers to the duplex yield obtainable with a given probe and its
target after performing a hybridization experiment. In many embodiments, the probes
present on the array surface that exhibit high binding efficiency have a binding
efficiency for their target of at least 0.1%, usually at least 0.5% and more usually at least 2%.

Furthermore, the sequence of the long oligonucleotide probes is chosen to provide
for low non-specific hybridization or non-specific binding, i.e. unwanted
cross-hybridization, to target nucleic acids for which the probes are not substantially
complementary under stringent conditions. A given target is considered to be substantially
non-complementary to a given probe if the target has homology to the probe of less than
60%, more commonly less than 50% and most commonly less than 40%, as determined
using the BLAST program with default settings. In certain embodiments, oligonucleotide
probes having low non-specific hybridization characteristics and finding use in the subject
arrays are those in which their relative ability to hybridize to non-complementary nucleic
acids, i.e., other targets for which they are not substantially complementary, is less than 10%,
usually less than 5% and preferably less than 1% of their ability to bind to their
complementary target. For example, in a side-by-side hybridization assay, probes having
low non-specific hybridization characteristics are those which generate a positive signal, if
any, when contacted with a target composition that does not include a complementary
target for the probe, that is less than about 10%, usually less than about 3% and more
usually less than about 1% of the signal that is generated by the same probe when it is
contacted with a target composition that includes a complementary target.
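The side-by-side comparison described above reduces to a ratio of two signals. The following is a minimal sketch, assuming that background-corrected signal intensities are already available; the function names and tier labels are illustrative and do not come from the specification.

```python
def cross_hybridization_ratio(nonspecific_signal: float, specific_signal: float) -> float:
    """Signal obtained without the complementary target, as a fraction of the
    signal obtained with it (both assumed to be background-corrected)."""
    if specific_signal <= 0:
        raise ValueError("specific_signal must be positive")
    return nonspecific_signal / specific_signal


def specificity_tier(ratio: float) -> str:
    """Map a ratio onto the three thresholds quoted in the text (10%, 3%, 1%)."""
    if ratio < 0.01:
        return "below 1%"
    if ratio < 0.03:
        return "below 3%"
    if ratio < 0.10:
        return "below 10%"
    return "fails 10% threshold"
```

A probe whose ratio falls in the last tier would not meet even the broadest of the stated limits.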

In addition, the long oligonucleotides of a given spot are chosen so that each long
oligonucleotide probe present on the array, or at least its target specific sequence, is not
homologous with any other distinct unique long oligonucleotide present on the array, i.e.
any other oligonucleotide probe on the array with a different base sequence. In other
words, each distinct oligonucleotide of a probe composition corresponding to a first target
does not cross-hybridize with, or have the same sequence as, any other distinct unique
oligonucleotide of any probe composition corresponding to a different target, i.e. an
oligonucleotide of any other oligonucleotide probe composition that is represented on the
array. As such, the sense or anti-sense nucleotide sequence of each unique oligonucleotide
of a probe composition will have less than 90% homology, usually less than 70%
homology, and more usually less than 50% homology with any other different
oligonucleotide of a probe composition corresponding to a different target of the array,
where homology is determined by sequence analysis comparison using the FASTA
program using default settings. The sequences of unique oligonucleotides in the probe
compositions are not conserved sequences found in a number of different genes (at least
two), where a conserved sequence is defined as a stretch of from about 15 to 150
nucleotides which have at least about 90% sequence identity, where sequence identity is
measured as above.
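The text specifies FASTA with default settings for the probe-to-probe homology check. As a rough illustration of the idea only, the sketch below swaps in a much simpler measure, the best ungapped identity over all offsets, and tests both the sense and anti-sense orientation. It is not a substitute for FASTA, and all names in it are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def ungapped_identity(a: str, b: str) -> float:
    """Best fraction of matching positions over every ungapped offset of b
    against a, normalized by the length of the shorter sequence."""
    if not a or not b:
        raise ValueError("sequences must be non-empty")
    a, b = a.upper(), b.upper()
    shorter = min(len(a), len(b))
    best = 0
    for offset in range(-(len(b) - 1), len(a)):
        matches = sum(
            1
            for i in range(len(b))
            if 0 <= i + offset < len(a) and a[i + offset] == b[i]
        )
        best = max(best, matches)
    return best / shorter


def too_similar(a: str, b: str, threshold: float = 0.90) -> bool:
    """Flag a probe pair whose sense or anti-sense identity reaches the threshold."""
    sense = ungapped_identity(a, b)
    antisense = ungapped_identity(a, reverse_complement(b))
    return max(sense, antisense) >= threshold
```

The default threshold of 0.90 corresponds to the broadest limit stated above; the 70% and 50% limits can be applied by passing a different `threshold`.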

The oligonucleotides of each probe composition, or at least the portion of these
oligonucleotides that is complementary to their intended targets, i.e. their target specific
sequences, are further characterized as follows. First, they have a GC content of from
about 35% to 80%, usually between about 40 to 70%. Second, they have a substantial
absence of: (a) secondary structures, e.g. regions of self-complementarity (e.g. hairpins),
structures formed by intramolecular hybridization events; (b) long homopolymeric
stretches, e.g. polyA stretches, such that in any given homopolymeric stretch, the number of
contiguous identical nucleotide bases does not exceed 5; (c) long stretches characterized
by or enriched by the presence of repeating motifs, e.g. GAGAGAGA, GAAGAGAA, etc.;
(d) long stretches of homopurine or homopyrimidine rich motifs; and the like.
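The GC content and homopolymer limits above lend themselves to a simple sequence screen. The sketch below is illustrative only: the function names are hypothetical, the defaults use the broadest stated GC range (35% to 80%) and the run limit of 5, and no attempt is made to check secondary structure, repeating motifs, or homopurine/homopyrimidine stretches.

```python
from itertools import groupby


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of contiguous identical bases."""
    return max(len(list(run)) for _, run in groupby(seq.upper()))


def passes_composition_filters(
    seq: str, gc_min: float = 0.35, gc_max: float = 0.80, max_run: int = 5
) -> bool:
    """Apply the GC content window and the homopolymer run limit."""
    if not seq:
        return False
    return gc_min <= gc_fraction(seq) <= gc_max and longest_homopolymer(seq) <= max_run
```

Passing `gc_min=0.40, gc_max=0.70` applies the narrower "usually" range instead.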

The long oligonucleotide probes of the subject invention may be made up solely of
the target specific sequence as described above, e.g. sequence designed or present which is
intended for hybridization to the probe's corresponding target, or may be modified to
include one or more non-target complementary domains or regions, e.g. at one or both
termini of the probe, where these domains may be present to serve a number of functions,
including attachment to the substrate surface, introduction of a desired conformational
structure into the probe sequence, etc. One optional domain or region that may be present
at one or both termini of the long oligonucleotide probes of the subject arrays is a
region enriched for the presence of thymidine bases, e.g. an oligo dT region, where the
number of nucleotides in this region is typically at least 3, usually at least 5 and more
usually at least 10, where the number of nucleotides in this region may be higher, but
generally does not exceed about 25 and usually does not exceed about 20, where at least a
substantial proportion of, if not all of, the nucleotides in this region include a thymidine
base, where by substantial proportion is meant at least about 50, usually at least about 70
and more usually at least about 90 number % of all nucleotides in the oligo dT region.
Certain probes of this embodiment of the subject invention, i.e. those in which the T
enriched domain is an oligo dT domain, may be described by the following formula:

T_n-N_m-T_k

wherein:

T is dTMP;

N_m is the target specific sequence of the probe in which N is either dTMP, dGMP,
dCMP or dAMP and m is from 50 to 100; and

n and k are independently from 0 to 15, where when present n and/or k are
preferably 5 to 10.
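A probe of this kind is a target specific sequence of m nucleotides flanked by n dTMP residues on one side and k on the other. A minimal sketch of assembling such a sequence follows, assuming plain A/C/G/T strings; the function name is hypothetical, and the range checks simply mirror the stated limits (m from 50 to 100, n and k from 0 to 15).

```python
def with_oligo_dt_tails(target_specific: str, n: int = 0, k: int = 0) -> str:
    """Flank a target specific sequence with n leading and k trailing T residues."""
    core = target_specific.upper()
    if set(core) - set("ACGT"):
        raise ValueError("target specific sequence must contain only A, C, G, T")
    if not 50 <= len(core) <= 100:
        raise ValueError("m (target specific length) must be from 50 to 100")
    if not (0 <= n <= 15 and 0 <= k <= 15):
        raise ValueError("n and k must each be from 0 to 15")
    return "T" * n + core + "T" * k
```

For example, a 60 nt target specific sequence with `n=5` and `k=10` yields a 75 nt probe.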

In yet other embodiments and often in addition to the above described T enriched
domains, the subject probes may also include domains that impart a desired constrained
structure to the probe, e.g. impart to the probe a structure which is fixed or has a restricted
conformation. In many embodiments, the probes include domains which flank either end
of the target specific domain and are capable of imparting a hairpin loop structure to the
probe, whereby the target specific sequence is held in confined or limited conformation
which enhances its binding properties with respect to its corresponding target during use.
In these embodiments, the probe may be described by the following formula:

T_n-N_p-N_m-N_o-T_k

wherein:

T is dTMP;

N is dTMP, dGMP, dCMP or dAMP;

m is an integer from 50 to 100;

n and k are independently from 0 to 15, where when present n and/or k are
preferably 5 to 10, where in many embodiments k=n=5 to 10, more preferably 10; and

p and o are independently 5 to 20, usually 5 to 15, and more usually about 10,
wherein in many embodiments p=o=5 to 15 and preferably 10;

such that N_m is the target specific sequence; and

N_o and N_p are self complementary sequences, e.g. they are complementary to each
other, such that under hybridizing conditions the probe forms a hairpin loop structure in
which the stem is made up of the N_o and N_p sequences and the loop is made up of the
target specific sequence, i.e. N_m.
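In the hairpin embodiment, the target specific sequence forms the loop and two mutually complementary flanking sequences form the stem, optionally with oligo dT tails at the termini. The sketch below assembles such a probe by deriving the second stem arm as the reverse complement of the first. This is one simple way to satisfy the complementarity requirement with equal arm lengths (p = o); it is an assumption of the example rather than something the specification prescribes, and the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def hairpin_probe(target_specific: str, stem: str, n: int = 0, k: int = 0) -> str:
    """Assemble tail, first stem arm, loop, second stem arm, tail."""
    loop = target_specific.upper()
    arm_5 = stem.upper()
    if not 50 <= len(loop) <= 100:
        raise ValueError("m (target specific length) must be from 50 to 100")
    if not 5 <= len(arm_5) <= 20:
        raise ValueError("stem arm length must be from 5 to 20")
    if not (0 <= n <= 15 and 0 <= k <= 15):
        raise ValueError("n and k must each be from 0 to 15")
    arm_3 = reverse_complement(arm_5)  # pairs with arm_5 to close the stem
    return "T" * n + arm_5 + loop + arm_3 + "T" * k
```

With a 60 nt loop, 10 nt stem arms and 5 nt tails, the assembled probe is 90 nt long.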

The oligonucleotide probe compositions that make up each oligonucleotide probe
spot on the array will be substantially, usually completely, free of non-nucleic acids, i.e.
the probe compositions will not include or be made up of non-nucleic acid biomolecules
found in cells, such as proteins, lipids, and polysaccharides. In other words, the
oligonucleotide spots of the arrays are substantially, if not entirely, free of non-nucleic
acid cellular constituents.

The oligonucleotide probes may be nucleic acid, e.g. RNA, DNA, or nucleic acid
mimetics, e.g. nucleic acids that differ from naturally occurring nucleic acids in some
manner, e.g. through modified backbones, sugar residues, bases, etc., such as nucleic acids
comprising non-naturally occurring heterocyclic nitrogenous bases, peptide-nucleic acids,
locked nucleic acids (see Singh & Wengel, Chem. Commun. (1998) 1247-1248); and the
like. In many embodiments, however, the nucleic acids are not modified with a
functionality which is necessary for attachment to the substrate surface of the array, e.g. an
amino functionality, biotin, etc.

The oligonucleotide probe spots made up of the long oligonucleotides described
above and present on the array may be any convenient shape, but will typically be circular,
elliptical, oval or some other analogously curved shape. The total amount or mass of
oligonucleotides present in each spot will be sufficient to provide for adequate
hybridization and detection of target nucleic acid during the assay in which the array is
employed. Generally, the total mass of oligonucleotides in each spot will be at least about
0.1 ng, usually at least about 0.5 ng and more usually at least about 1 ng, where the total
mass may be as high as 100 ng or higher, but will usually not exceed about 20 ng and
more usually will not exceed about 10 ng. The copy number of all of the oligonucleotides
in a spot will be sufficient to provide enough hybridization sites for target molecules to
yield a detectable signal, and will generally range from about 0.001 fmol to 10 fmol,
usually from about 0.005 fmol to 5 fmol and more usually from about 0.01 fmol to 1 fmol.
Where the spot is made up of two or more distinct oligonucleotides of differing sequence,
the molar ratio or copy number ratio of different oligonucleotides within each spot may be
about equal or may be different, wherein when the ratio of unique oligonucleotides within
each spot differs, the magnitude of the difference will usually be at least 2 to 5 fold but
will generally not exceed about 10 fold. Where the spot has an overall circular dimension,
the diameter of the spot will generally range from about 10 to 5,000 μm, usually from
about 20 to 1,000 μm and more usually from about 50 to 500 μm. The surface area of
each spot is at least about 100 μm², usually at least about 200 μm² and more usually at
least about 400 μm², and may be as great as 25 mm² or greater, but will generally not
exceed about 5 mm² and usually will not exceed about 1 mm².
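The spot quantities above are given in several units (diameter, area, mass, molar amount). The helpers below are a rough sketch for moving between them. They assume an ideal circular spot and an approximate average mass of 330 g/mol per nucleotide of single stranded DNA, which is a common rule of thumb rather than a value from the specification; the function names are hypothetical.

```python
import math

# Assumption: approximate average molar mass per nucleotide of single stranded DNA.
AVG_SSDNA_NT_MASS = 330.0  # g/mol


def spot_area_um2(diameter_um: float) -> float:
    """Area of an ideal circular spot, in square micrometers."""
    return math.pi * (diameter_um / 2) ** 2


def um2_to_mm2(area_um2: float) -> float:
    """Convert square micrometers to square millimeters."""
    return area_um2 / 1_000_000


def mass_ng_to_fmol(mass_ng: float, length_nt: int,
                    nt_mass: float = AVG_SSDNA_NT_MASS) -> float:
    """Approximate molar amount (fmol) of an oligonucleotide of a given length."""
    molar_mass = length_nt * nt_mass       # g/mol
    moles = mass_ng * 1e-9 / molar_mass    # ng -> g -> mol
    return moles * 1e15                    # mol -> fmol
```

Because the per-nucleotide mass is an average, results from `mass_ng_to_fmol` should be read as order-of-magnitude estimates.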

Array Features

The arrays of the subject invention are characterized by having a plurality of probe
spots as described above stably associated with the surface of a solid support. The density
of probe spots on the array, as well as the overall density of probe and non-probe nucleic
acid spots (where the latter are described in greater detail infra) may vary greatly. As used
herein, the term nucleic acid spot refers to any spot on the array surface that is made up of
nucleic acids, and as such includes both probe nucleic acid spots and non-probe nucleic
acid spots. The density of the nucleic acid spots on the solid surface is at least about 5/cm²
and usually at least about 10/cm² and may be as high as 1000/cm² or higher, but in many
embodiments does not exceed about 1000/cm² and in these embodiments usually does not
exceed about 500/cm² or 400/cm² and in certain embodiments does not exceed about
300/cm². The spots may be arranged in a spatially defined and physically addressable
manner, in any convenient pattern across or over the surface of the array, such as in rows
and columns so as to form a grid, in a circular pattern, and the like, where generally the
pattern of spots will be present in the form of a grid across the surface of the solid support.
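For a regular grid, spot density follows directly from the center-to-center spacing, and each spot can be addressed by its row and column. The sketch below illustrates both, assuming a square grid with uniform pitch; the function names and the coordinate convention (x along columns, y along rows, in micrometers) are choices made for this example only.

```python
def spots_per_cm2(pitch_um: float) -> float:
    """Spot density of a square grid with the given center-to-center pitch."""
    if pitch_um <= 0:
        raise ValueError("pitch_um must be positive")
    spots_per_cm = 10_000 / pitch_um  # 1 cm = 10,000 micrometers
    return spots_per_cm ** 2


def grid_addresses(rows: int, cols: int, pitch_um: float) -> dict:
    """Map each (row, column) address to the (x, y) position of the spot center."""
    return {
        (r, c): (c * pitch_um, r * pitch_um)
        for r in range(rows)
        for c in range(cols)
    }
```

A pitch of 1,000 μm, for instance, gives 100 spots/cm² under these assumptions.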
In the subject arrays, the spots of the pattern are stably associated with the surface
of a solid support, where the support may be a flexible or rigid support. By "stably
associated" it is meant that the oligonucleotides of the spots maintain their position
relative to the solid support under hybridization and washing conditions. As such, the
oligonucleotide members which make up the spots can be non-covalently or covalently
stably associated with the support surface based on technologies well known to those of
skill in the art. Examples of non-covalent association include non-specific adsorption,
binding based on electrostatic interactions (e.g. ion, ion pair interactions), hydrophobic
interactions, hydrogen bonding interactions, specific binding through a specific binding pair member
covalently attached to the support surface, and the like. Examples of covalent binding
include covalent bonds formed between the spot oligonucleotides and a functional group
present on the surface of the rigid support, e.g. -OH, where the functional group may be
naturally occurring or present as a member of an introduced linking group. In many
preferred embodiments, the nucleic acids making up the spots on the array surface, or at
least the long oligonucleotides of the probe spots, are covalently bound to the support
surface, e.g. through covalent linkages formed between moieties present on the probes
(e.g. thymidine bases) and the substrate surface, etc.

As mentioned above, the array is present on either a flexible or rigid substrate. By
flexible is meant that the support is capable of being bent, folded or similarly manipulated
without breakage. Examples of solid materials which are flexible solid supports with
respect to the present invention include membranes, flexible plastic films, and the like. By
rigid is meant that the support is solid and does not readily bend, i.e. the support is not
flexible. As such, the rigid substrates of the subject arrays are sufficient to provide
physical support and structure to the polymeric targets present thereon under the assay
conditions in which the array is employed, particularly under high throughput handling
conditions. Furthermore, when the rigid supports of the subject invention are bent, they
are prone to breakage.

The solid supports upon which the subject patterns of spots are presented in the
subject arrays may take a variety of configurations ranging from simple to complex,
depending on the intended use of the array. Thus, the substrate could have an overall slide
or plate configuration, such as a rectangular or disc configuration. In many embodiments,
the substrate will have a rectangular cross-sectional shape, having a length of from about
10 mm to 200 mm, usually from about 40 to 150 mm and more usually from about 75 to
125 mm and a width of from about 10 mm to 200 mm, usually from about 20 mm to 120
mm and more usually from about 25 to 80 mm, and a thickness of from about 0.01 mm to
5.0 mm, usually from about 0.1 mm to 2 mm and more usually from about 0.2 to 1 mm.
Thus, in one representative embodiment the support may have a micro-titre plate format,
having dimensions of approximately 125 x 85 mm. In another representative embodiment,
the support may be a standard microscope slide with dimensions of about 25 x 75
mm.

The substrates of the subject arrays may be fabricated from a variety of materials.
The materials from which the substrate is fabricated should ideally exhibit a low level of
non-specific binding during hybridization events. In many situations, it will also be
preferable to employ a material that is transparent to visible and/or UV light. For flexible
substrates, materials of interest include: nylon, both modified and unmodified,
nitrocellulose, polypropylene, and the like, where a nylon membrane, as well as
derivatives thereof, is of particular interest in this embodiment. For rigid substrates,
specific materials of interest include: glass; plastics, e.g. polytetrafluoroethylene,
polypropylene, polystyrene, polycarbonate, and blends thereof, and the like; metals, e.g.
gold, platinum, and the like; etc. Also of interest are composite materials, such as glass or
plastic coated with a membrane, e.g. nylon or nitrocellulose, etc.

The substrates of the subject arrays comprise at least one surface on which the
pattern of spots is present, where the surface may be smooth or substantially planar, or
have irregularities, such as depressions or elevations. The surface on which the pattern of
spots is present may be modified with one or more different layers of compounds that
serve to modify the properties of the surface in a desirable manner. Such modification
layers, when present, will generally range in thickness from a monomolecular thickness to
about 1 mm, usually from a monomolecular thickness to about 0.1 mm and more usually
from a monomolecular thickness to about 0.001 mm. Modification layers of interest
include: inorganic and organic layers such as metals, metal oxides, polymers, small
organic molecules and the like. Polymeric layers of interest include layers of: peptides,
proteins, polynucleic acids or mimetics thereof, e.g. peptide nucleic acids and the like;
polysaccharides, phospholipids, polyurethanes, polyesters, polycarbonates, polyureas,
polyamides, polyethyleneamines, polyarylene sulfides, polysiloxanes, polyimides,
polyacetates, polyacrylamides, and the like, where the polymers may be hetero- or
homopolymeric, and may or may not have separate functional moieties attached thereto,
e.g. conjugated.

The total number of spots on the substrate will vary depending on the number of
different oligonucleotide probe spots (oligonucleotide probe compositions) one wishes to
display on the surface, as well as the number of non-probe spots, e.g. control spots,
orientation spots, calibrating spots and the like, as may be desired depending on the
particular application in which the subject arrays are to be employed. Generally, the
pattern present on the surface of the array will comprise at least about 10 distinct nucleic
acid spots, usually at least about 20 nucleic acid spots, and more usually at least about 50
nucleic acid spots, where the number of nucleic acid spots may be as high as 10,000 or
higher, but will usually not exceed about 5,000 nucleic acid spots, and more usually will
not exceed about 3,000 nucleic acid spots and in many instances will not exceed about
2,000 nucleic acid spots. In certain embodiments, it is preferable to have each distinct
probe spot or probe composition be presented in duplicate, i.e. so that there are two
duplicate probe spots displayed on the array for a given target. In certain embodiments,
each target represented on the array surface is only represented by a single type of
oligonucleotide probe. In other words, all of the oligonucleotide probes on the array for a
given target represented thereon have the same sequence. In certain embodiments, the
number of spots will range from about 200 to 1200. The number of probe spots present in
the array will typically make up a substantial proportion of the total number of nucleic
acid spots on the array, where in many embodiments the number of probe spots is at least
about 50 number %, usually at least about 80 number % and more usually at least about
90 number % of the total number of nucleic acid spots on the array. As such, in many
embodiments the total number of probe spots on the array ranges from about 50 to 20,000,
usually from about 100 to 10,000 and more usually from about 200 to 5,000.

In the arrays of the subject invention (particularly those designed for use in high
throughput applications, such as high throughput analysis applications), a single pattern of
oligonucleotide spots may be present on the array or the array may comprise a plurality of
different oligonucleotide spot patterns, each pattern being as defined above. When a
plurality of different oligonucleotide spot patterns are present, the patterns may be
identical to each other, such that the array comprises two or more identical
oligonucleotide spot patterns on its surface, or the oligonucleotide spot patterns may be
different, e.g. in arrays that have two or more different types of target nucleic acids
represented on their surface, e.g. an array that has a pattern of spots corresponding to
human genes and a pattern of spots corresponding to mouse genes. Where a plurality of
spot patterns are present on the array, the number of different spot patterns is at least 2,
usually at least 6, more usually at least 24 or 96, where the number of different patterns
will generally not exceed about 384.

Where the array comprises a plurality of oligonucleotide spot patterns on its
surface, preferably the array comprises a plurality of reaction chambers, wherein each
chamber has a bottom surface having associated therewith a pattern of oligonucleotide
spots and at least one wall, usually a plurality of walls surrounding the bottom surface.
See e.g. U.S. Patent No. 5,545,531, the disclosure of which is herein incorporated by
reference. Of particular interest in many embodiments are arrays in which the same
pattern of spots is reproduced in 24 or 96 different reaction chambers across the surface of
the array.

Within any given pattern of spots on the array, there may be a single spot that
corresponds to a given target or a number of different spots that correspond to the same
target, where when a plurality of different spots are present that correspond to the same
target, the probe compositions of each spot that corresponds to the same target may be
identical or different. In other words, a plurality of different targets are represented in the
pattern of spots, where each target may correspond to a single spot or a plurality of spots,
where the oligonucleotide probe composition among the plurality of spots corresponding
to the same target may be the same or different. Where a plurality of spots (of the same or
different composition) corresponding to the same target is present on the array, the
number of spots in this plurality will be at least about 2 and may be as high as 10, but will
usually not exceed about 5. As mentioned above, however, in many preferred
embodiments any given target nucleic acid is represented by only a single type
of probe spot, which may be present only once or multiple times on the array surface, e.g.
in duplicate, triplicate etc.

The number of different targets represented on the array is at least about 2, usually
at least about 10 and more usually at least about 20, where in many embodiments the
number of different targets, e.g. genes, represented on the array is at least about 50 and
more usually at least about 100. The number of different targets represented on the array
may be as high as 5,000 or higher, but in many embodiments will usually not exceed
about 3,000 and more usually will not exceed about 2,500. A target is considered to be
represented on an array if it is able to hybridize to one or more probe compositions on the
array.

Another feature of the present invention is that the relative binding efficiencies of
the distinct long oligonucleotide probes for their respective targets are substantially
the same, such that the binding efficiency of any two different long oligonucleotide probes
on the arrays for their respective targets does not vary by more than about 20 fold, usually
by not more than about 15 fold and more usually by not more than about 10 fold, where in
many embodiments the binding efficiencies do not vary by more than about 5 fold and
preferably by not more than about 3 fold.
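
The fold limits recited above can be illustrated with a short calculation. The following is a minimal sketch only, assuming that a binding efficiency has already been measured for each probe as a positive number in arbitrary units; the function names and the tuple `FOLD_LIMITS` are hypothetical and are introduced here purely for illustration.

```python
# Illustrative sketch: fold variation across measured binding efficiencies.
# The measurement itself is assumed to come from the hybridization assay;
# nothing here is prescribed by the specification.

FOLD_LIMITS = (3.0, 5.0, 10.0, 15.0, 20.0)  # limits recited in the text, tightest first


def fold_variation(efficiencies):
    """Return max/min over the efficiencies, i.e. the largest pairwise fold difference."""
    values = list(efficiencies)
    if len(values) < 2:
        raise ValueError("at least two efficiencies are needed")
    if min(values) <= 0:
        raise ValueError("efficiencies must be positive")
    return max(values) / min(values)


def tightest_limit_met(efficiencies):
    """Return the smallest recited fold limit that the set satisfies, or None."""
    fold = fold_variation(efficiencies)
    for limit in FOLD_LIMITS:
        if fold <= limit:
            return limit
    return None
```

Because the largest pairwise ratio in a set is always the ratio of its maximum to its minimum, a single max/min comparison is sufficient to check every pair of probes.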

In certain preferred embodiments of the invention, each of the probe spots in the
array comprising the long oligonucleotide probe compositions corresponds to the same
kind of gene; i.e. genes that all share some common characteristic or can be grouped
together based on some common feature, such as species of origin, tissue or cell of origin,
functional role, disease association, etc. In this embodiment, each of the different target
nucleic acids that correspond to the different probe spots on the array is of the same
type, i.e. is a coding sequence of the same type of gene. As such, the arrays of this
embodiment of the subject invention will be of a specific array type. A variety of specific
array types are provided by the subject invention. Specific array types of interest include:
human, cancer, apoptosis, cardiovascular, cell cycle, hematology, mouse, human stress,
mouse stress, oncogene and tumor suppressor, cell-cell interaction, cytokine and cytokine
receptor, rat, rat stress, blood, neurobiology, and the like. For a more
detailed description of the different target nucleic acids represented on at least some of
these types of arrays, see PCT/US98/10561, the disclosure of which is herein incorporated
by reference, as well as: U.S. Patent Application Serial No. 08/859,998; U.S. Patent
Application Serial No. 08/974,298; U.S. Patent Application Serial No. 09/225,998; U.S.
Application Serial No. 09/221,480; U.S. Application Serial No. 09/222,432; U.S.
Application Serial No. 09/222,436; U.S. Application Serial No. 09/222,437; U.S.
Application Serial No. 09/222,251; U.S. Application Serial No. 09/221,481; U.S.
Application Serial No. 09/222,256; U.S. Application Serial No. 09/222,248; U.S.
Application Serial No. 09/222,253; U.S. Application Serial No.
(entitled "Human Cardiovascular Array," and having Att'y docket no. CLON-006CIP10);
U.S. Application Serial No. (entitled "Human Neurobiology Array," and having
Att'y docket no. CLON-006CIP11); U.S. Application Serial No. (entitled "Rat
Array," and having Att'y docket no. CLON-006CIP12); U.S. Application Serial No.
(entitled "Human Array," and having Att'y docket no. CLON-006CIP13); U.S.
Application Serial No. (entitled "Cancer Array," and having Att'y docket no.
CLON-006CIP14); U.S. Application Serial No. (entitled "Hematology/Immunology
Array," and having Att'y docket no. CLON-006CIP15); U.S. Application Serial No.
(entitled "Mouse Stress/Toxicology Array," and having Att'y docket no.
CLON-006CIP17); and U.S. Application Serial No. (entitled "Rat
Stress/Toxicology Array," and having Att'y docket no. CLON-006CIP18); the
disclosures of which are incorporated herein by
reference. In many embodiments, at least 20 different, usually at least 30 different and
often at least 50 different genes and in many embodiments at least 100 different genes
from the tables of genes listed in these applications are represented on the subject arrays.

With respect to the oligonucleotide probes that correspond to a particular type or
kind of gene, type or kind can refer to a plurality of different characterizing features,
where such features include: species specific genes, where specific species of interest
include eukaryotic species, such as mice, rats, rabbits, pigs, primates, humans, etc.;
function specific genes, where such genes include oncogenes, apoptosis genes, cytokines,
receptors, protein kinases, etc.; genes specific for or involved in a particular biological
process, such as apoptosis, differentiation, stress response, aging, proliferation, etc.;
cellular mechanism genes, e.g. cell-cycle, signal transduction, metabolism of toxic
compounds, etc.; disease associated genes, e.g. genes involved in cancer, schizophrenia,
diabetes, high blood pressure, atherosclerosis, viral-host interaction and infectious diseases,
etc.; location specific genes, where locations include organ, such as heart, liver, prostate,
lung etc., tissue, such as nerve, muscle, connective, etc., cellular, such as axonal,
lymphocytic, etc., or subcellular locations, e.g. nucleus, endoplasmic reticulum, Golgi
complex, endosome, lysosome, peroxisome, mitochondria, cytoplasm, cytoskeleton,
plasma membrane, extracellular space; chromosome-specific genes; specific genes that
change expression level over time, e.g. genes that are expressed at different levels during
the progression of a disease condition, such as prostate genes which are induced or
repressed during the progression of prostate cancer.

In addition to the oligonucleotide spots comprising the oligonucleotide probe
compositions (i.e. oligonucleotide probe spots), the subject arrays may comprise one or
more additional spots of polynucleotides or nucleic acid spots which do not correspond to
target nucleic acids as defined above, such as target nucleic acids of the type or kind of
gene represented on the array in those embodiments in which the array is of a specific
type. In other words, the array may comprise one or more non-probe nucleic acid spots
that are made of non "unique" oligonucleotides or polynucleotides, i.e. common
oligonucleotides or polynucleotides. For example, spots comprising genomic DNA may
be provided in the array, where such spots may serve as orientation marks. Spots
comprising plasmid and bacteriophage genes, genes from the same or another species
which are not expressed and do not cross hybridize with the cDNA target, and the like,
may be present and serve as negative controls. In addition, spots comprising a plurality of
oligonucleotides complementary to housekeeping genes and other control genes from the
same or another species may be present, which spots serve in the normalization of mRNA
abundance and standardization of hybridization signal intensity in the sample assayed with
the array. Orientation spots may also be included on the array, where such spots serve to
simplify image analysis of hybridization patterns. Other types of spots include spots for
calibration or quantitative standards, controls for integrity of RNA template (targets),
controls for efficiency of steps in target preparation (such as efficiency of labeling,
purification and hybridization), etc. These latter types of spots are distinguished from the
oligonucleotide probe spots, i.e. they are non-probe spots.

Array Preparation 

The subject arrays can be prepared using any convenient means. One means of
preparing the subject arrays is to first synthesize the oligonucleotides for each spot and
then deposit the oligonucleotides as a spot on the support surface. The oligonucleotides
may be prepared using any convenient methodology, where chemical synthesis procedures
using phosphoramidite or analogous protocols in which individual bases are added
sequentially without the use of a polymerase, e.g. such as is found in automated solid
phase synthesis protocols, and the like, are of particular interest, where such techniques
are well known to those of skill in the art.
In determining the specific oligonucleotides of the probe compositions, the
oligonucleotide should be chosen so that it is capable of hybridizing to a region of the target
nucleic acid or gene having a sequence unique to that gene. Different methods may be
employed to choose the specific region of the gene to which the oligonucleotide probe is
to hybridize. Thus, one can use a random approach based on availability of a gene of
interest. However, instead of such a random approach, a rational design approach may
also be employed to choose the optimal
sequence for the hybridization array. Preferably, the region of the gene that is selected in
preparing the oligonucleotide probe is chosen based on the following criteria. First, the
sequence that is chosen as the target specific sequence should yield an oligonucleotide
probe that does not cross-hybridize with, and is not homologous to, any other oligonucleotide
probe for other spots present on the array that do not correspond to the target gene.
Second, the sequence should be chosen such that the oligonucleotide probe has a low
homology to a nucleotide sequence found in any other gene from the same species of
origin, whether or not the gene is to be represented on the array. As such, sequences that are
avoided include those found in: highly expressed gene products, structural RNAs,
repeated sequences found in the RNA sample to be tested with the array and sequences
found in vectors. A further consideration is to select sequences which provide for minimal
or no secondary structure, structure which allows for optimal hybridization but low
non-specific binding, equal or similar thermal stabilities, and optimal hybridization
characteristics. Another consideration is to select probe sequences that give rise to probes
which efficiently hybridize to their corresponding target and do not suffer from substantial
non-specific hybridization events. Finally, all of the probe sequences on the array are
preferably chosen such that they exhibit substantially the same hybridization efficiency to
their corresponding targets, where the difference in hybridization efficiency between any
two probes and their corresponding targets preferably does not exceed about 10 fold, more
preferably does not exceed about 5 fold and most preferably does not exceed about 3 fold.

Probes meeting the above criteria can be designed or identified using any
convenient protocol. A representative protocol includes the following algorithm which is
part of the present invention. In selecting probes according to this representative algorithm
or process, a unique gene-specific or target specific sequence (one or more regions per
gene) is first identified based on a sequence homology search algorithm described in
detail in copending application serial no. 09/053,375, the disclosure of which is herein
incorporated by reference. In this step, the sequences of all genes to be represented on the
array to be produced and all sequences deposited in GenBank are searched in order to select
mRNA fragments which are unique for each mRNA or target to be represented on the
array. A unique sequence is defined as a sequence which at least does not have significant
homology to any other sequence on the array. For example, where one is interested in
identifying suitable 80 base long unique probes, sequences which do not have homology
of more than about 80% to any consecutive 40 base segment of any of the other probes on
the array are selected. This step typically results in a reduced population of candidate
probe sequences as compared to the initial population of possible sequences identified for
each specific target.
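
The 80 base example above can be sketched in code. The sketch below is illustrative only and is not the homology search algorithm of application serial no. 09/053,375: it assumes that "homology" may be approximated by ungapped percent identity between 40 base windows, it compares only the given strands (reverse complements are not considered), and the function names `max_window_identity` and `is_unique` are hypothetical.

```python
# Illustrative sketch: reject a candidate probe if any 40-base window of it is
# more than 80% identical (ungapped) to any 40-base window of another probe.
# Ungapped identity is an assumption standing in for the homology search.


def max_window_identity(candidate: str, other: str, window: int = 40) -> float:
    """Highest fraction of matching positions between any two windows."""
    candidate, other = candidate.upper(), other.upper()
    if len(candidate) < window or len(other) < window:
        raise ValueError("both sequences must be at least `window` bases long")
    best = 0.0
    for i in range(len(candidate) - window + 1):
        a = candidate[i:i + window]
        for j in range(len(other) - window + 1):
            b = other[j:j + window]
            matches = sum(x == y for x, y in zip(a, b))
            best = max(best, matches / window)
    return best


def is_unique(candidate: str, others, window: int = 40,
              max_identity: float = 0.80) -> bool:
    """True if no window pair exceeds `max_identity` against any other probe."""
    return all(max_window_identity(candidate, o, window) <= max_identity
               for o in others)
```

For 80 base probes this brute-force comparison examines 41 x 41 window pairs per probe pair, which is tractable for a test array; a search against all of GenBank would in practice require an indexed search tool rather than this direct comparison.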

Of this reduced population of candidate sequences, screening criteria are employed
to exclude non-optimal sequences, where sequences that are excluded or screened out in
this step include: (a) those with strong secondary structure or self-complementarity (for
example long hairpins); (b) those with very high (more than 70%) or very low (less than
40%) GC content; (c) those with long stretches (more than 6) of identical consecutive
bases or long stretches of sequences enriched in some motifs, purine or pyrimidine
stretches or particular bases, like GAGAGAGA..., GAAGAGAA; and the like. This step
results in a further reduction in the population of candidate probe sequences.
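
Criteria (b) and (c) above lend themselves to a simple filter. The following is a minimal sketch under stated assumptions: it implements only the GC content limits and the limit on runs of identical consecutive bases, and it does not attempt the secondary structure screen of criterion (a) or the motif enrichment screen mentioned in criterion (c). The function names are hypothetical.

```python
# Illustrative sketch of screening criteria (b) and (c): GC content between
# 40% and 70%, and no run of more than 6 identical consecutive bases.
import re


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_run(seq: str) -> int:
    """Length of the longest stretch of identical consecutive bases."""
    if not seq:
        raise ValueError("sequence must not be empty")
    # Each match is one maximal run of a single repeated character.
    return max(len(m.group(0)) for m in re.finditer(r"(.)\1*", seq.upper()))


def passes_screen(seq: str, gc_min: float = 0.40, gc_max: float = 0.70,
                  max_run: int = 6) -> bool:
    """True if the sequence survives the GC content and run-length screens."""
    return gc_min <= gc_fraction(seq) <= gc_max and longest_run(seq) <= max_run
```

The default thresholds are taken directly from the percentages and the run length recited above; they are exposed as parameters because the text presents them as representative values.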

In the next step, sequences are selected that have similar melting temperatures or
thermodynamic stability which will provide similar performance in hybridization assays
with target nucleic acids. Of interest is the identification of probes that can participate in
duplexes whose melting temperature exceeds 65 °C, usually at least about 75 °C and more
usually at least about 80 °C.
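
The specification does not state how melting temperature is to be estimated. As a hedged illustration only, the sketch below uses one widely quoted empirical approximation for duplexes of long oligonucleotides, Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%GC) - 675/N, where N is the probe length in bases. The choice of this formula, the default sodium concentration of 1.0 M, and the function names are all assumptions made for illustration; nearest-neighbor thermodynamic models would be an equally valid choice.

```python
# Illustrative sketch: approximate duplex melting temperature and compare it
# with the 65 degree C floor recited in the text. The formula is an assumed,
# commonly used empirical approximation, not a method given in the source.
import math


def gc_percent(seq: str) -> float:
    """Percentage of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm_celsius(seq: str, na_molar: float = 1.0) -> float:
    """Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N (approximation)."""
    n = len(seq)
    return (81.5 + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent(seq) - 675.0 / n)


def meets_tm_floor(seq: str, floor_celsius: float = 65.0,
                   na_molar: float = 1.0) -> bool:
    """True if the estimated Tm exceeds the given floor."""
    return estimate_tm_celsius(seq, na_molar) > floor_celsius
```

Selecting probes with similar melting temperatures then amounts to keeping candidates whose estimates fall within a chosen band above the floor; the width of that band is not specified in the text.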

The final step in this representative design process is to select from the remaining
sequences those sequences which provide for low levels of non-specific hybridization and
similar high efficiency hybridization with complementary target molecules. This final
selection is accomplished by practicing the following steps:

1. The remaining set of probes which is identified for each target using the above
steps, where this remaining set typically includes at least 1 potential probe, usually
at least 2 potential probes and more usually at least 3 potential probes, is
experimentally characterized for hybridization efficiency and propensity to
participate in non-specific hybridization events using the following protocol.

2. First, an array of at least a portion of the candidate probes for each target to be
represented on the final array is produced. For example, where three candidate
probes have been identified for a particular target sequence, these probes are
attached to the surface of a solid support, along with candidate probes for other
targets, to produce a test probe array.

3. Next, a normalization control target set is prepared, wherein each target in the set
is complementary to one probe sequence in the array and the various target
constituents of the set are mixed in similar or identical amounts. The number of
targets in the set of control targets is usually less than the number of probes in the
array. Usually the number of targets in the control set is between 50% and 90%,
but can be between 10% and 100%, of the number of test probes on the array
surface. As such, not all of the probe sequences on the test array will have a
corresponding or complementary target in the target control set. For example,
where three different candidate probes have been identified for each of 10 different
mRNA targets, a test probe array of 30 different oligonucleotide probes is
prepared. Next, a control set of target nucleic acids which includes targets that
correspond to 5 of the 10 different mRNA targets represented on the array is
produced, where the control set includes a target that is complementary to each
different probe corresponding to 1 of the 5 different mRNAs represented in the
control target set, i.e. the control target set includes 15 different targets, 1 target
for each of the 15 probes on the array that correspond to the 5 different mRNAs
represented in the control target set. (While the above procedure has been
described in terms of using a target population that corresponds to less than all of
the probes on the array so that non-specific hybridization can be determined, other
protocols also may be employed. For example, one may use a population of targets
that corresponds to all of the probes on the array, where at least a portion of the
targets are distinguishable from the remaining portion or portions, e.g. by label,
mass etc. Following hybridization, the targets hybridized to each probe can be
detected and both the efficiency of the probe for its true target and its propensity
for non-specific hybridization can be determined.)

4. Following generation of the control set of targets, the control set is hybridized with
the test probe array under stringent conditions and hybridization signals are
detected. The intensity of the signal for those probes which have a corresponding
labeled complementary target in the hybridization solution is used as a measure for
determining the hybridization efficiency of that probe, as well as differences in
hybridization efficiency of different candidate probes for different targets. For
those probes on the array which do not have complementary labeled target
sequences in the control set, the intensity of hybridization signal generated by each of
these probes is used to identify the level of non-specific hybridization that
characterizes these probes.

5. The above steps are repeated with one or more additional control sets of target
nucleic acids in order to get comprehensive information concerning the
hybridization efficiency and level of non-specific hybridization for each
of the candidate probes on the array. The number of different sets of control
targets that are employed in this process is generally at least two, more commonly
at least four and most commonly at least ten.



6. From the above steps, probe sequences meeting the following criteria are
identified for use as long oligonucleotide probes in the arrays of the subject
invention. First, candidate probes that exhibit a high efficiency of hybridization for
their corresponding targets are identified. In many embodiments, candidate probes
having substantially the same hybridization efficiency for the respective targets are
identified, where any two probes to different targets have substantially the same
hybridization efficiency for their respective targets if the difference in
hybridization efficiency of the two probes does not exceed 10-fold, where
differences of less than about 5-fold and often less than about 3-fold are preferred.
Of these identified probes, probes that show substantial cross hybridization or
non-specific hybridization are excluded, where a probe whose level of non-specific
hybridization is within 5-fold, more commonly within 20-fold and most commonly
within 50-fold of the level of gene-specific hybridization between the probe and its
corresponding target is excluded in this step. In other words, in the above assay
hybridizations, those probes that exhibit a non-specific signal that is within 5-fold,
usually within 20-fold and more usually within 50-fold of the signal
generated by probes and their complementary targets are excluded as being probes
with unacceptably high propensities for participating in non-specific hybridization
events.
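
The selection described in steps 1 to 6 can be illustrated as follows. This is a minimal sketch of one possible reading of step 6, not a definitive implementation: it assumes that, for each candidate probe, the test array hybridizations have already yielded a signal measured with its labeled complementary target present (`specific_signal`) and a signal measured with that target absent (`nonspecific_signal`). The `Candidate` record, the function `select_probes`, and the rule of keeping the strongest surviving candidate per target are hypothetical choices made for illustration.

```python
# Illustrative sketch of step 6: exclude probes with high non-specific signal,
# keep one probe per target, then enforce the cross-target efficiency limit.
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    target: str                # mRNA or gene the probe was designed against
    probe_id: str              # hypothetical identifier of the candidate probe
    specific_signal: float     # signal with complementary labeled target present
    nonspecific_signal: float  # signal with complementary target absent


def select_probes(candidates, max_efficiency_fold: float = 10.0,
                  min_specificity_fold: float = 5.0):
    """Return {target: Candidate} for targets with an acceptable probe."""
    # Exclude probes whose non-specific signal is within
    # `min_specificity_fold` of their specific signal.
    specific = [c for c in candidates
                if c.specific_signal > 0
                and c.specific_signal >= min_specificity_fold * c.nonspecific_signal]

    # Assumed tie-break: keep the strongest surviving candidate per target.
    best = {}
    for c in specific:
        if c.target not in best or c.specific_signal > best[c.target].specific_signal:
            best[c.target] = c
    if not best:
        return {}

    # Drop targets whose chosen probe is more than `max_efficiency_fold`
    # weaker than the strongest chosen probe.
    top = max(c.specific_signal for c in best.values())
    return {t: c for t, c in best.items()
            if top / c.specific_signal <= max_efficiency_fold}
```

Targets absent from the returned mapping have no acceptable candidate in the current round and would be candidates for a repeated round of design. The default limits of 10-fold and 5-fold correspond to the least stringent values recited in step 6 and can be tightened to the preferred values.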



The above algorithm or process is used to design the long oligonucleotide probes
that are present on the arrays of the subject invention. Steps 1 to 6 can be repeated if, in
the first round of selection, no acceptable candidate probes were identified for particular
targets. Once the design or sequence of the probes is identified, the long oligonucleotide probes
may be synthesized according to any convenient protocol, as mentioned above, e.g. via
phosphoramidite processes.

Following synthesis of the subject long oligonucleotide probes, the probes are
stably associated with the surface of the solid support. This portion of the preparation
process typically involves deposition of the probes, e.g. a solution of the probes, onto the
surface of the substrate, where the deposition process may or may not be coupled with a
covalent attachment step, depending on how the probes are to be stably attached to the
substrate surface, e.g. via electrostatic interactions, covalent bonds, etc. The prepared
oligonucleotides may be spotted on the support using any convenient methodology,
including manual techniques, e.g. by micro pipette, ink jet, pins, etc., and automated
protocols. Of particular interest is the use of an automated spotting device, such as the
BioGrid Arrayer (Biorobotics).

Where desired, the long oligonucleotides can be covalently bonded to the substrate
surface using a number of different protocols. For example, functionally active groups
such as amino, etc., can be introduced onto the 5' or 3' ends of the oligonucleotides, where
the introduced functionalities are then reacted with active surface groups on the substrate
to provide the covalent linkage. In certain preferred embodiments, the long
oligonucleotide probes are covalently bonded to the surface of the substrate using the
following protocol. In this process, the probes are covalently attached to the substrate
surface under denaturing conditions. Typically, a denaturing composition of each probe is
prepared and then deposited on the substrate surface. By denaturing composition is meant
that the probe molecules present in the composition are not participating in secondary
structures, e.g. through self-hybridization or hybridization to other molecules in the
composition. The denaturing composition, typically a fluid composition, may be any
composition which inhibits the formation of hydrogen bonds between complementary
nucleotide bases. Thus, compositions of interest are those that include a denaturing agent,
e.g. urea, formamide, sodium thiocyanate, etc., as well as solutions having a high pH, e.g.
12 to 13.5, usually 12.5 to 13, or a low pH, e.g. 1 to 4, usually 1 to 3; and the like. In many
preferred embodiments, the composition is a strongly alkaline solution of the long
oligonucleotide, where the composition comprises a base, e.g. sodium hydroxide, lithium
hydroxide, potassium hydroxide, ammonium hydroxide, tetramethyl ammonium
hydroxide, etc., in sufficient amounts to impart the desired high pH
to the composition, e.g. 12.5 to 13.0. The concentration of long oligonucleotide in the
composition typically ranges from about 0.1 to 10 µM, usually from about 0.5 to 5 µM.
Following deposition of the denaturing composition of the long oligonucleotide probe onto
the substrate surface, the deposited probe is exposed to UV radiation of sufficient
wavelength, e.g. from 250 to 350 nm, to cross link the deposited probe to the surface of
the substrate. The irradiation energy for this process typically ranges from about 50 to
1000 mJoules, usually from about 100 to 500 mJoules, where the duration of the exposure
typically lasts from about 20 to 600 sec, usually from about 30 to 120 sec.

The above protocol for covalent attachment results in the random covalent binding
of the long oligonucleotide probe to the substrate surface by one or more attachment sites
on the probe, where such attachment may optionally be enhanced through inclusion of
oligodT regions at one or more ends of the oligonucleotides, as discussed supra. An
important feature of the above process is that reactive moieties, e.g. amino, that are not
present on naturally occurring oligonucleotides are not employed in the subject methods.
As such, the subject methods are suitable for use with oligonucleotides that do not include
moieties that are not present on naturally occurring nucleic acids.

The above described covalent attachment protocol may be used with a variety of
different types of substrates. Thus, the above described protocols can be employed with
solid supports, such as glass, plastics, membranes, e.g. nylon, and the like. The surfaces
may or may not be modified. For example, the nylon surface may be charge neutral or
positively charged, where such substrates are available from a number of commercial
sources. For glass surfaces, in many embodiments the glass surface is modified, e.g. to
display reactive functionalities, such as amino, phenyl isothiocyanate, etc.
Methods of Using the Subject Arrays



The subject arrays find use in a variety of different applications in which one is
interested in detecting the occurrence of one or more binding events between target
nucleic acids and probes on the array and then relating the occurrence of the binding
event(s) to the presence of a target(s) in a sample. In general, the device will be contacted
with the sample suspected of containing the target under conditions sufficient for binding
of any target present in the sample to complementary oligonucleotides present on the
array. Generally, the sample will be a fluid sample and contact will be achieved by
introduction of an appropriate volume of the fluid sample onto the array surface, where
introduction can be through delivery ports, direct contact, deposition, and the like.

Generation of Labeled Target 

Targets may be generated by methods known in the art. mRNA can be labeled and
used directly as a target, or converted to a labeled cDNA target. Alternatively, an excess
of synthetic labeled oligonucleotide target which is complementary to the probes on the
array can be hybridized with the mRNA, followed by separation of any unbound target
from the hybridized fraction or isolation of the hybridized fraction. The hybridized
fraction can then be hybridized to the array to reveal the expression pattern of the cellular
source from which the mRNA was derived. Usually, mRNA is labeled non-specifically
(randomly) directly using chemically, photochemically or enzymatically activated labeling
compounds, such as photobiotin (Clontech, Palo Alto, CA), Dig-Chem-Link (Boehringer),
and the like. Alternatively, the mRNA target can be labeled specifically in the sequences
which are complementary to the probes. This specific labeling can be achieved by using
covalent or non-covalent binding of additional labeled oligonucleotides (or mimetics) to
the target sequences which flank the probe complementary sequence or the
complementary probe sequence. The hybridized fraction of labeled oligonucleotides with
mRNA can be purified or separated from the non-hybridized fraction and then hybridized
to the array. Generally, methods for generating labeled cDNA targets include the use of
oligonucleotide primers. Primers that may be employed include oligo dT, random primers,
e.g. random hexamers, and gene specific primers, as described in PCT/US98/10561, the
disclosure of which is herein incorporated by reference.

Where gene specific primers are employed, the gene specific primers are
preferably those primers that correspond to the different oligonucleotide spots on the
array. Thus, one will preferably employ gene specific primers for each different
oligonucleotide that is present on the array, so that if the gene is expressed in the
particular cell or tissue being analyzed, labeled target will be generated from the sample
for that gene. In this manner, if a particular gene present on the array is expressed in a
particular sample, the appropriate target will be generated and subsequently identified. For
each target represented on the array, a single gene specific primer may be employed or a
plurality of different gene specific primers may be employed, where when a plurality are
used to produce the target, the number will generally not exceed about 3. Generally, in
preparing the target from template nucleic acid, e.g. mRNA, the gene specific primers will
hybridize to a region of the template that is downstream from the region to which the
probes are homologous, e.g. to which the probes are complementary or have the same
sequence. The distance between the oligonucleotide probe sequence and the primer binding
site generally does not exceed about 500 nt, usually does not exceed about 300 nt and more
usually does not exceed about 200 nt. However, in certain embodiments the gene specific
primers may be partially or completely complementary to the oligonucleotide probes. The
cDNA probe can be further amplified by PCR or can be converted (linearly amplified)
using phage coded RNA polymerase transcription of dsDNA. See PCT/US98/1056, the
disclosure of which is herein incorporated by reference.
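
The spacing guideline for gene specific primers described above can be illustrated with a short check. The following is only a minimal sketch: the function name, the 0-based template coordinates and the returned labels are assumptions made for illustration and are not part of the described methods. It classifies the gap between the probe-homologous region and the downstream primer binding site against the approximate 200, 300 and 500 nt limits stated in the text.

```python
def classify_primer_spacing(probe_end: int, primer_start: int) -> str:
    """Classify the gap (in nt) between the probe region and the primer site.

    probe_end    -- template index of the last base of the probe-homologous
                    region (hypothetical 0-based coordinate)
    primer_start -- template index of the first base of the downstream
                    primer binding site (hypothetical 0-based coordinate)
    """
    distance = primer_start - probe_end - 1  # bases strictly between the two
    if distance < 0:
        # Primer partially or completely overlaps the probe region,
        # which the text allows in certain embodiments.
        return "overlapping"
    if distance <= 200:
        return "<=200"   # "more usually" range
    if distance <= 300:
        return "<=300"   # "usually" range
    if distance <= 500:
        return "<=500"   # "generally" range
    return "exceeds"     # outside the approximate limits in the text


if __name__ == "__main__":
    print(classify_primer_spacing(probe_end=100, primer_start=251))
```

Because the limits in the text are approximate ("about"), a real design tool would likely treat these thresholds as soft guidance rather than hard cut-offs.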

In many embodiments, the target that is generated in this step is a linear target
which is devoid of any secondary structure, e.g. as produced by target intramolecular
interactions such as hydrogen bonds. However, in certain embodiments, it may be
desirable to generate a conformationally restricted or constrained target, e.g. a target that
forms a hairpin loop structure under the hybridization conditions in which the target is
employed. One means of producing hairpin loop targets is to employ primers that include
an anchoring sequence in addition to a priming sequence in the enzymatic target generation
step. The anchoring domain of the primer, which is 5' of the priming domain, is a domain
that is complementary to a region of the first strand cDNA distal to the 5' end that is
generated during target synthesis, where the 5' distal region to which the anchor is
complementary is sufficiently separated from the 5' end of the cDNA such that the cDNA
forms a hairpin loop structure in which the anchor sequence and the 5' distal region to
which the anchor sequence is complementary form the stem structure. The sequence of the
anchor domain of the primer is typically chosen to provide for a loop that ranges in size
from about 20 to 200 nt, usually from about 30 to 100 nt and more usually from about 40
to 80 nt. The primers used to generate these hairpin loop targets are described by the
following formula:

5'-NxNp-3'

wherein

N is dGMP, dCMP, dAMP or dTMP;

p is an integer ranging from 12 to 35, usually from 15 to 30 and more usually from
18 to 25, such that Np is the priming domain of the primer, and may be a gene specific
domain, as described above, or an oligo dT domain; and

x is an integer ranging from 3 to 30, usually from 5 to 20 and more usually from 5
to 15, wherein Nx is the anchor domain and is complementary to a 5' distal portion of the
first strand cDNA that is complementary to the mRNA of interest which is to be
represented as target.
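
The length constraints of the primer formula above can be expressed as a simple validity check. This is a minimal sketch under stated assumptions: the function and constant names are hypothetical, the primer is supplied as two separate strings (anchor domain and priming domain), and only the broadest ranges given in the text (x from 3 to 30, p from 12 to 35) are tested. It does not check that the anchor is actually complementary to the first strand cDNA.

```python
import re

X_RANGE = (3, 30)   # anchor domain length x, broadest range in the text
P_RANGE = (12, 35)  # priming domain length p, broadest range in the text


def check_hairpin_primer(anchor: str, priming: str) -> bool:
    """Return True if a 5'-Nx Np-3' primer satisfies the stated length ranges.

    anchor  -- sequence of the anchor domain Nx (5' part of the primer)
    priming -- sequence of the priming domain Np (3' part of the primer)
    """
    sequence = (anchor + priming).upper()
    # Each position N must be one of the four deoxynucleotides.
    bases_ok = re.fullmatch(r"[ACGT]+", sequence) is not None
    x_ok = X_RANGE[0] <= len(anchor) <= X_RANGE[1]
    p_ok = P_RANGE[0] <= len(priming) <= P_RANGE[1]
    return bases_ok and x_ok and p_ok


if __name__ == "__main__":
    # Hypothetical 10-nt anchor followed by an 18-nt oligo dT priming domain.
    print(check_hairpin_primer("ACGTACGTAC", "T" * 18))
```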

A variety of different protocols may be used to generate the labeled target nucleic
acids, as is known in the art, where such methods typically rely on the enzymatic
generation of the labeled target using the initial primer. Labeled primers can be employed
to generate the labeled target. Alternatively, label can be incorporated during first strand
synthesis or subsequent synthesis labeling or amplification steps, including chemical or
enzymatic labeling steps, in order to produce labeled target. Representative methods of
producing labeled target are disclosed in PCT/US98/10561, the disclosure of which is
herein incorporated by reference.

Hybridization and Detection

As mentioned above, following preparation of the target nucleic acid from the
tissue or cell of interest, the target nucleic acid is then contacted with the array under
hybridization conditions, where such conditions can be adjusted, as desired, to provide for
an optimum level of specificity in view of the particular assay being performed. Suitable
hybridization conditions are well known to those of skill in the art and reviewed in
Maniatis et al., supra and WO 95/21944. Of particular interest in many embodiments is the
use of stringent conditions during hybridization, i.e. conditions that are optimal in terms
of rate, yield and stability for specific probe-target hybridization and provide for a
minimum of non-specific probe/target interaction. Stringent conditions are known to those
of skill in the art. In the present invention, stringent conditions are typically characterized
by temperatures ranging from 15 to 35, usually 20 to 30 °C less than the melting
temperature of the probe target duplexes, which melting temperature is dependent on a
number of parameters, e.g. temperature, buffer compositions, size of probes and targets,
concentration of probes and targets, etc. As such, the temperature of hybridization
typically ranges from about 55 to 70, usually from about 60 to 68 °C. In the presence of
denaturing agents, the temperature may range from about 35 to 45, usually from about 37
to 42 °C. The stringent hybridization conditions are further typically characterized by the
presence of a hybridization buffer, where the buffer is characterized by one or more of the
following characteristics: (a) having a high salt concentration, e.g. 3 to 6 x SSC (or other
salts with similar concentrations); (b) the presence of detergents, like SDS (from 0.1 to
20%), Triton X-100 (from 0.01 to 1%), Nonidet NP40 (from 0.1 to 5%) etc.; (c) other
additives, like EDTA (typically from 0.1 to 1 µM), tetramethylammonium chloride; (d)
accelerating agents, e.g. PEG, dextran sulfate (5 to 10%), CTAB, SDS and the like; (e)
denaturing agents, e.g. formamide, urea etc.; and the like.
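
The relationship between melting temperature and stringent hybridization temperature described above can be sketched numerically. The example below is illustrative only: the function name is hypothetical, and the melting temperature is assumed to be supplied by the user (the text does not give a formula for it). The offsets of 15 to 35 °C and 20 to 30 °C below the melting temperature are taken from the text.

```python
def stringent_hyb_window(tm_celsius: float) -> dict:
    """Return hybridization temperature windows (low, high) in deg C.

    tm_celsius -- melting temperature of the probe-target duplex, assumed
                  to have been measured or estimated separately.
    """
    return {
        # "typically" 15 to 35 deg C below the melting temperature
        "typical": (tm_celsius - 35.0, tm_celsius - 15.0),
        # "usually" 20 to 30 deg C below the melting temperature
        "usual": (tm_celsius - 30.0, tm_celsius - 20.0),
    }


if __name__ == "__main__":
    # Hypothetical duplex with a melting temperature of 90 deg C.
    print(stringent_hyb_window(90.0))
```

For a hypothetical duplex melting at 90 °C, the "usual" window of 60 to 70 °C overlaps the 60 to 68 °C hybridization range quoted in the text.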

In analyzing the differences in the population of labeled target nucleic acids
generated from two or more physiological sources using the arrays described above, in
certain embodiments each population of labeled target nucleic acids is separately
contacted to identical probe arrays or together to the same array under conditions of
hybridization, preferably under stringent hybridization conditions, such that labeled target
nucleic acids hybridize to complementary probes on the substrate surface. In yet other
embodiments, labeled target nucleic acids are combined with distinguishably labeled
standard or control target nucleic acids followed by hybridization of the combined
populations to the array surface, as described in application serial no. 09/298,361, the
disclosure of which is herein incorporated by reference.

Where all of the target sequences comprise the same label, different arrays will be
employed for each physiological source (where different could include using the same
array at different times). Alternatively, where the labels of the targets are different and
distinguishable for each of the different physiological sources being assayed, the
opportunity arises to use the same array at the same time for each of the different target
populations. Examples of distinguishable labels are well known in the art and include:
two or more different emission wavelength fluorescent dyes, like Cy3 and Cy5, two or
more isotopes with different energy of emission, like ³²P and ³³P, gold or silver particles
with different scattering spectra, labels which generate signals under different treatment
conditions, like temperature, pH, treatment by additional chemical agents, etc., or generate
signals at different time points after treatment. Using one or more enzymes for signal
generation allows for the use of an even greater variety of distinguishable labels, based on
different substrate specificity of enzymes (alkaline phosphatase/peroxidase).

Following hybridization, non-hybridized labeled nucleic acid is removed from the
support surface, conveniently by washing, generating a pattern of hybridized nucleic acid
on the substrate surface. A variety of wash solutions are known to those of skill in the art
and may be used.

The resultant hybridization patterns of labeled nucleic acids may be visualized or 
detected in a variety of ways, with the particular manner of detection being chosen based 
on the particular label of the target nucleic acid, where representative detection means
include scintillation counting, autoradiography, fluorescence measurement, colorimetric
measurement, light emission measurement, light scattering, and the like.

Following detection or visualization, the hybridization patterns may be compared
to identify differences between the patterns. Where arrays in which each of the different
probes corresponds to a known gene are employed, any discrepancies can be related to a
differential expression of a particular gene in the physiological sources being compared.

The provision of appropriate controls on the arrays permits a more detailed
analysis that controls for variations in hybridization conditions, cross-hybridization,
non-specific binding and the like. Thus, for example, in a preferred embodiment, the
hybridization array is provided with normalization controls as described supra. These
normalization controls are probes complementary to control target sequences added in a
known concentration to the sample. Where the overall hybridization conditions are poor,
the normalization controls will show a smaller signal reflecting reduced hybridization.
Conversely, where hybridization conditions are good, the normalization controls will
provide a higher signal reflecting the improved hybridization. Normalization of the signal
derived from other probes in the array to the normalization controls thus provides a
control for variations in hybridization conditions. Normalization control is also useful to
adjust (e.g. correct) for differences which arise from the array quality, the mRNA sample
quality, efficiency of first-strand synthesis, etc. Typically, normalization is accomplished
by dividing the measured signal from the other probes in the array by the average signal
produced by the normalization controls. Normalization may also include correction for
variations due to sample preparation and amplification. Such normalization may be
accomplished by dividing the measured signal by the average signal from the sample
preparation/amplification control probes. The resulting values may be multiplied by a
constant value to scale the results.
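
The normalization arithmetic described above (division by the average control signal, followed by optional scaling) can be sketched as follows. This is a minimal illustration, not the method's required implementation: the function name, the dictionary-based data layout and the example probe names are assumptions.

```python
from statistics import mean


def normalize_signals(probe_signals: dict, control_signals: list,
                      scale: float = 1.0) -> dict:
    """Divide each probe signal by the mean control signal, then scale.

    probe_signals   -- mapping of probe name to measured signal
    control_signals -- measured signals of the normalization control probes
    scale           -- constant multiplier used to scale the results
    """
    control_avg = mean(control_signals)
    if control_avg <= 0:
        raise ValueError("average control signal must be positive")
    return {name: scale * signal / control_avg
            for name, signal in probe_signals.items()}


if __name__ == "__main__":
    # Hypothetical raw intensities for two probes and two control spots.
    raw = {"gene_a": 200.0, "gene_b": 50.0}
    print(normalize_signals(raw, [100.0, 100.0]))
```

The same function could be applied a second time with the sample preparation/amplification control signals to perform the additional correction mentioned in the text.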

In certain embodiments, normalization controls are often unnecessary for useful 
quantification of a hybridization signal. Thus, where optimal probes have been identified, 
the average hybridization signal produced by the selected optimal probes provides a good 
quantified measure of the concentration of hybridized nucleic acid. However,
normalization controls may still be employed in such methods for other purposes, e.g. to
account for array quality, mRNA sample quality, etc.

Utility

The subject methods find use in, among other applications, differential gene
expression assays. Thus, one may use the subject methods in the differential expression
analysis of: (a) diseased and normal tissue, e.g. neoplastic and normal tissue; (b) different
tissue or tissue types; (c) developmental stage; (d) response to external or internal
stimulus; (e) response to treatment; and the like. The subject arrays therefore find use in
broad scale expression screening for drug discovery, diagnostics and research, as well as
studying the effect of a particular active agent on the expression pattern of genes in a
particular cell, where such information can be used to reveal drug toxicity,
carcinogenicity, etc., environmental monitoring, disease research and the like.

Kits 

Also provided are kits for performing analyte binding assays using the subject
devices, where kits for carrying out differential gene expression analysis assays are
preferred. Such kits according to the subject invention will at least comprise the subject
arrays. The kits may further comprise one or more additional reagents employed in the
various methods, such as primers for generating target nucleic acids, dNTPs and/or
rNTPs, which may be either premixed or separate, one or more uniquely labeled dNTPs
and/or rNTPs, such as biotinylated or Cy3 or Cy5 tagged dNTPs, gold or silver particles
with different scattering spectra, or other post synthesis labeling reagent, such as
chemically active derivatives of fluorescent dyes, enzymes, such as reverse transcriptases,
DNA polymerases, RNA polymerases, and the like, various buffer mediums, e.g.
hybridization and washing buffers, prefabricated probe arrays, labeled probe purification
reagents and components, like spin columns, etc., signal generation and detection
reagents, e.g. streptavidin-alkaline phosphatase conjugate, chemifluorescent or
chemiluminescent substrate, and the like.

The following examples are offered by way of illustration and not by way of
limitation.

EXPERIMENTAL 

In the following examples, all percentages are by weight and all solvent mixture 
proportions are by volume unless otherwise noted. 

Example 1 - Generation of ^^P-labeled hybridization target.
Step A. cDNA Synthesis/Labeling Procedure

The 10-µl reaction described below converts 1 µg of synthetic control RNA into
^^P-labeled first-strand cDNA.

For each labeling reaction:

1. Prepare enough master mix for all labeling reactions and 1 extra reaction to ensure
sufficient volume. For each 10-µl labeling reaction, mix the following reagents:

2 µl 5x First-strand buffer (250 mM Tris-HCl, pH 8.3; 375 mM KCl; 15 mM MgCl2)
1 µl 10x dNTP mix (500 µM dGTP, 500 µM dCTP, 500 µM dTTP, 5 µM dATP)
4 µl [α-^^P]dATP (Amersham, 2500 Ci/mmol, 10 mCi/ml)
1 µl MMLV reverse transcriptase (Amersham, 200 units/µl)

8 µl Final volume

2. Combine the following in a 0.5-ml PCR test tube:

1 µg (1 µl) control s64 RNA

GGCCA GGATACCAAA GCCTTACAGG ACTTCCTCCT CAGTGTGCAG ATGTGCCCAG GTAATCGAGA
CACTTACTTT CACCTGCTTC AGACTCTGAA GAGGCTAGAT CGGAGGGATG AGGCCACTGC
ACTCTGGTGG AGGCTGGAGG CCCAAACTAA GGGGTCACAT GAAGATGCTC TGTGGTCTCT
CCCCCTGTAC CTAGAAAGCT ATTTGAGCTG GATCCGTCCC TCTGATCGTG ACGCCTTCCT
TGAAGAATTT CGGACATCTC TGCCAAAGTC TTGTGACCTG TAGCTGCC (SEQ ID NO: 01)

1 µl gene-specific primer s64 (0.2 µM)

CGGCCAGGATACCAAAGCCTTACAG (SEQ ID NO: 02)

The control s64 RNA provided above was synthesized by T7 transcription from a cDNA
fragment corresponding to the human DNA repair protein XRCC9 (GB accession number
U70310) as described in more detail in patent application serial no. 09/298,361, the
disclosure of which is herein incorporated by reference.

3. Add ddH2O to a final volume of 3 µl.

4. Mix contents and spin the tubes briefly in a microcentrifuge.

5. Incubate the tubes in a preheated PCR thermocycler at 70 °C for 2 min.

6. Reduce the temperature in the thermocycler to 50 °C and incubate for 2 min.

7. Add 8 µl of master mix to each reaction test tube.

8. Mix the contents of the test tubes by gentle pipetting.

9. Incubate the tubes in the PCR thermocycler for 20 min at 50 °C.

10. Stop the reaction by adding 1 µl of 10x termination mix (0.1 M EDTA, 1 mg/ml
glycogen).
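
The master mix scaling in step 1 of Step A (all labeling reactions plus one extra reaction) can be worked through as below. This is a sketch for illustration only: the function and dictionary names are hypothetical, and the per-reaction volumes of 2, 1, 4 and 1 µl are taken from the reagent list of step 1.

```python
# Per-reaction volumes in microliters, as listed in step 1 of Step A.
PER_REACTION_UL = {
    "5x first-strand buffer": 2.0,
    "10x dNTP mix": 1.0,
    "labeled dATP": 4.0,
    "MMLV reverse transcriptase": 1.0,
}


def master_mix(n_reactions: int, extra: int = 1) -> dict:
    """Scale each reagent for n_reactions plus `extra` spare reactions."""
    if n_reactions < 1:
        raise ValueError("need at least one labeling reaction")
    factor = n_reactions + extra
    volumes = {reagent: ul * factor for reagent, ul in PER_REACTION_UL.items()}
    # Total equals 8 ul per reaction times the number of reactions prepared.
    volumes["total"] = sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for reagent, ul in master_mix(4).items():
        print(f"{reagent}: {ul} ul")
```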

Step B. Column Chromatography

To purify the ^^P-labeled cDNAs from unincorporated ^^P-labeled nucleotides and small
(<0.1-kb) cDNA fragments, follow this procedure for each test tube:

1. Remove a CHROMA SPIN-200 column (CLONTECH) from the refrigerator and warm it up at
room temperature for about 1 hour. Invert the column several times to completely
resuspend the gel matrix.

Note: Check for air bubbles in the column matrix. If bubbles are visible, resuspend the
matrix in the column buffer (ddH2O) by inverting the column again.

2. Remove the bottom cap from the column, and then slowly remove the top cap.

3. Place the column into a 1.5-ml microcentrifuge tube.

4. Let the water drain through the column by gravity flow until you can see the surface of
the gel beads in the column matrix. The top of the column matrix should be at the 0.75-ml
mark on the wall of the column. If the column contains less matrix, adjust the volume of
the matrix to the 0.75-ml mark using matrix from another column.

5. Discard the collected water and proceed with purification.

6. Carefully and slowly apply the sample to the center of the gel bed's flat surface and
allow the sample to be fully absorbed into the resin bed before proceeding to the next step.
Do not allow any sample to flow along the inner wall of the column.

7. Apply 25 µl of ddH2O and allow the water to completely drain out of the column.

8. Apply 200 µl of ddH2O and allow the buffer to completely drain out of the column until
there is no liquid left above the resin bed.

9. Transfer the column to a clean 1.5-ml microcentrifuge tube.

10. To collect the first fraction add 100 µl of ddH2O to the column and allow the water to
completely drain out of the column.

11. To collect the second, third and fourth fractions repeat steps 9-10.

12. Place the tubes with fractions 1-4 in empty scintillation counter vials (do not add
scintillation cocktail to the tubes or vials), and obtain Cerenkov counts for each fraction.
Count the entire sample in the tritium channel.

13. Pool the fractions (usually fractions 2-3) which show the highest Cerenkov counts.
Discard the column and the fractions (usually fractions 1 and 4) which show less than 10%
of the counts from the peak fractions. Total incorporation into peak fractions should be
2-5x10^ cpm.
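
The fraction selection rule in step 13 can be sketched as a small helper. This is an illustration under an explicit assumption: the text only says to pool the highest-count fractions and to discard those below 10% of the peak, so the sketch pools every fraction at or above 10% of the single highest count. The function name and the example counts are hypothetical.

```python
def select_fractions(counts, discard_below=0.10):
    """Split column fractions into pooled and discarded sets.

    counts        -- Cerenkov counts (cpm) for fractions 1..n, in order
    discard_below -- fraction of the peak count under which a fraction
                     is discarded (10% in the text)
    Returns (pooled, discarded) as lists of 1-based fraction numbers.
    """
    peak = max(counts)
    threshold = discard_below * peak
    pooled = [i + 1 for i, c in enumerate(counts) if c >= threshold]
    discarded = [i + 1 for i, c in enumerate(counts) if c < threshold]
    return pooled, discarded


if __name__ == "__main__":
    # Hypothetical counts for fractions 1-4.
    print(select_fractions([1000, 250000, 180000, 9000]))
```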
Example 2. Preparation of Aminopropyl-glass.

1. Prepare wash solution: to get 2 liters, dissolve 200 g NaOH in 600 ml water and
make up the volume to 1 liter (20% w/v). To this solution add 1 liter ethanol. This
makes 10% NaOH in 50% EtOH. Wash glass in this solution on an orbital shaker
overnight (slides are placed in a rack).

2. Transfer rack(s) with slides into a bath with MilliQ water and wash on shaker for
15-20 min; repeat this step one more time.

3. Transfer slides into a bath with acetone and wash on shaker for 15-20 min. Repeat
this step two more times. Dispose of the acetone from the first wash and keep the acetone
from the 2nd and 3rd washes. (When doing this procedure again, use the 2nd wash as the
first, the 3rd as the second and for the 3rd wash use fresh acetone.)

4. Prepare in advance a 5% solution of water in acetone (5% water - 95% acetone).

5. During the last wash step prepare a 0.5% solution of aminopropyltriethoxysilane
(Sigma, cat. no. A3648) in the acetone-water mixture from step 4.

6. Transfer slides from the last acetone wash into the silanization solution and incubate
for 2 hours at room temperature on an orbital shaker.

7. Transfer slides into MilliQ water and wash for 20 minutes.

8. Transfer slides into acetone and wash for 20 min; repeat this step 2 more times.
These acetone washes are to be disposed of.

9. Preheat the oven to 110 °C.

10. Remove the rack with slides from the last acetone wash and transfer it into the preheated
oven. As some acetone still remains on the slides and on the rack's surfaces, the smell
becomes quite intense. The exhaust duct should be open after putting slides into the
oven and may be closed after the first 30 minutes of baking.

11. Program the oven to bake slides at 110 °C for 3 hours and then shut down or cool
down to room temperature. It is convenient to do this step overnight.

12. After baking in the oven, slides are ready for printing using the "thiocyanate method". If
the printing will not be done right away, slides may be kept in clean boxes inside
dry cabinets.
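
The concentrations quoted for the wash solution in step 1 of this example can be checked with simple arithmetic. The sketch below is only a worked calculation; the function name is hypothetical, and it assumes that the volumes of the aqueous NaOH solution and the ethanol are additive on mixing, which is an approximation.

```python
def wash_solution(naoh_g: float = 200.0, aqueous_ml: float = 1000.0,
                  ethanol_ml: float = 1000.0) -> dict:
    """Compute nominal concentrations of the NaOH/ethanol wash solution."""
    total_ml = aqueous_ml + ethanol_ml  # assumes additive volumes
    return {
        "stock NaOH % w/v": 100.0 * naoh_g / aqueous_ml,
        "final NaOH % w/v": 100.0 * naoh_g / total_ml,
        "final EtOH % v/v": 100.0 * ethanol_ml / total_ml,
    }


if __name__ == "__main__":
    for label, value in wash_solution().items():
        print(f"{label}: {value}")
```

With the default values from step 1, the calculation reproduces the stated 20% w/v stock, 10% NaOH and 50% EtOH figures.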



The following steps are for preparation of PDITC-slides.

1. Prepare a mixture of pyridine and dimethylformamide (10% pyridine and 90%
DMF). Prepare only as much as necessary. This mixture cannot be stored.

2. Dissolve 1,4-phenylenediisothiocyanate in the pyridine-DMF mixture at 0.1%
concentration (1 g per liter) on a stirrer. Prepare only as much of this solution as
necessary and only when ready to proceed with the next steps. This solution cannot be
stored. The solution should be light yellow-green in color.

3. Pour the solution in a tray and transfer tray(s) with amino-modified slides into the
solution. Close the tray with the lid and shake on an orbital shaker at low speed for 2
hours.

4. Transfer rack(s) with slides into a tray with acetone and wash on shaker for 10-15
minutes. Repeat this step 2 more times by transferring rack(s) into trays with fresh
acetone.

5. After the last wash quickly transfer racks with slides into a vacuum oven and dry in
vacuum at room temperature for 20-30 minutes. Vacuum should be applied as fast
as possible.

6. Dispose of the pyridine-DMF mixture and acetone washes in a flammable waste
container.

7. Transfer slides for storage into dry cabinets. Make sure the desiccant in the dry
cabinet is good (blue in color).



Example 3. Printing of oligonucleotides.

Oligonucleotides used in this experiment were dissolved in 0.1 M NaOH at 100
nanograms per microliter and printed on PDITC modified glass surface. The amount of DNA
deposited was about 5 ng per spot. After printing, slides were baked at 80 °C for 2 hours
and then UV crosslinked (254 nm UV lamp) for 1 min.



Example 4. Preparation of Array
Using the above protocol described in Examples 5 & 6, an array having the
characteristics of Table 1 was prepared. Each of the probe oligonucleotides was prepared
using an automated nucleic acid synthesizer.

Example 5 - Hybridization of ^^P-labeled cDNA Target with oligo glass ARRAY

1. Prepare a solution of 6x SSC buffer containing 0.1% SDS.

2. Place the glass slide with printed oligo DNA in a hybridization chamber and add 2 ml of
the solution prepared in Step 1.

3. Prehybridize for 30 min at 60 °C.

4. Mix labeled cDNA probe (Example 1, about 200 µl, total about 2-5x10^ cpm) with
1/10th of the total volume (about 22 µl) of 10x denaturing solution (1 M NaOH, 10
mM EDTA) and incubate at 65 °C for 20 min. Then add 5 µl (1 µg/µl) of human Cot-1
DNA and an equal volume (about 225 µl) of 2x neutralizing solution (1 M NaH2PO4, pH
7.0) and continue incubating at 65 °C for 10 min.

5. Add the mixture prepared in Step 4 to the 2 ml of solution prepared in Step 1. Make sure
that the two solutions are mixed together thoroughly.

6. Pour out the prehybridization solution and discard. Replace with the solution prepared in
Step 5.

7. Hybridize overnight at 60 °C.

8. Carefully remove the hybridization solution and discard in an appropriate container.
Place the glass slides in a washing chamber with 20 ml of Wash Solution 1 (2x SSC,
0.1% SDS). Wash the ARRAY for 10 min with continuous agitation at room temperature.
Repeat this step four times.

9. Perform one additional 10-min wash in 20 ml of Wash Solution 2 (0.1x SSC, 0.1%
SDS) with continuous agitation at room temperature.

10. Using forceps, remove the cDNA ARRAY from the container and shake off excess wash
solution. Rinse with distilled water and let the array air dry.

11. Expose the glass slide array to X-ray film at -70 °C with an intensifying screen.
Alternatively, use a phosphorimager (Molecular Dynamics).
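
The reagent volumes in step 4 of Example 5 follow from the stated proportions and can be reproduced approximately. The sketch below rests on two assumptions that the text does not spell out: "1/10th of the total volume" is read as one tenth of the combined probe plus denaturing solution volume, and "equal volume" is read as equal to the volume of the mixture after the Cot-1 DNA has been added. The function name is hypothetical.

```python
def denature_neutralize_volumes(probe_ul: float = 200.0,
                                cot1_ul: float = 5.0):
    """Return (10x denaturing, 2x neutralizing) volumes in microliters."""
    # A 10x stock at 1/10 of the final volume: stock = probe / 9.
    denat_ul = probe_ul / 9.0
    # 2x neutralizing solution is added in equal volume to the whole mixture.
    neutral_ul = probe_ul + denat_ul + cot1_ul
    return round(denat_ul, 1), round(neutral_ul, 1)


if __name__ == "__main__":
    print(denature_neutralize_volumes())
```

Under these assumptions the calculated values are close to the "about 22 µl" and "about 225 µl" given in step 4.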

Example 6. Assay for Hybridization Efficiency

Using the arrays and above protocols, the hybridization efficiency of each probe of
different length on the array described in Example 4 was assayed using ^^P labeled target
complementary to each of the probes. The results of this assay are provided in Fig. 1. The
results demonstrate that a significant increase in hybridization efficiency is achieved with
oligonucleotide probes having a length greater than 50 nt.

It is evident from the above discussion that the subject arrays provide for a
significant advance in the field. The subject invention provides for arrays of probes in
which all of the probes on the array have substantially the same level of high
hybridization efficiency for their respective targets and exhibit a minimal level of
non-specific hybridization. As such, the subject arrays eliminate the need for using multiple
probe sequences for each target of interest or using mismatch control probes for each
target, which is at least desired if not required with other array formats. In addition, the
arrays are readily fabricated using non PCR based protocols, where the fabrication process
is suitable for use in high throughput manufacturing. As such, the subject arrays combine
the benefits of high throughput manufacturability of short oligonucleotide arrays with the
benefits of high specificity observed in cDNA arrays. Accordingly, the subject invention
represents a significant contribution to the art.

All publications and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were
specifically and individually indicated to be incorporated by reference. The citation of any
publication is for its disclosure prior to the filing date and should not be construed as an
admission that the present invention is not entitled to antedate such publication by virtue
of prior invention.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it is readily apparent to
those of ordinary skill in the art in light of the teachings of this invention that certain
changes and modifications may be made thereto without departing from the spirit or scope
of the appended claims.

WHAT IS CLAIMED IS:

1. An array comprising at least one pattern of probe oligonucleotide spots stably
associated with the surface of a solid support, wherein each probe oligonucleotide spot of
said pattern corresponds to a target nucleic acid and comprises an oligonucleotide probe
composition made up of long oligonucleotide probes that range in length from about 50 to
120 nt.

2. The array according to Claim 1, wherein two or more different target nucleic acids
are represented in said pattern.

3. The array according to Claim 2, wherein each probe oligonucleotide spot in said
pattern corresponds to a different target nucleic acid.

4. The array according to Claim 1, wherein each long oligonucleotide probe on said
array has a high hybridization efficiency for its respective target.

5. The array according to Claim 1, wherein each long oligonucleotides of said array
has a low propensity for non-specific hybridization.

6. The array according to Claim 4, wherein each of said probe long oligonucleotides
of said array exhibit substantially the same high hybridization efficiency for their
respective targets.

7. The array according to Claim 1, wherein said long oligonucleotide probes are
covalently attached to said surface of said substrate.

8. The array according to Claim 7, wherein said each of said long oligonucleotide
probes is cross-linked to the surface of said support at at least one site.

9. The array according to Claim 1, wherein each of said oligonucleotide probes is
cross-linked to the surface of said support at at least two sites.

10. The array according to Claim 1, wherein the density of spots on said array does not
exceed about 1000/cm².

11. The array according to Claim 10, wherein the density of spots on said array does
not exceed about 400/cm².

12. The array according to Claim 1, wherein the number of spots on said array ranges
from about 50 to 50,000.

13. The array according to Claim 1, wherein the number of spots on said array ranges
from about 50 to 10,000.

14. An array comprising a pattern of probe oligonucleotide spots covalently bound to
the surface of a solid support, wherein each probe oligonucleotide spot corresponds to a
target nucleic acid and comprises a long oligonucleotide probe composition made up of
long oligonucleotides of from about 60 to 100 nt in length, wherein each of said long
oligonucleotide probes exhibits substantially the same high hybridization efficiency with
its respective target and low level of non-specific hybridization.

15. The array according to Claim 14, wherein ten or more different target nucleic acids
are represented in said pattern.

16. The array according to Claim 15, wherein each probe oligonucleotide spot in said
pattern corresponds to a different target nucleic acid.

17. The array according to Claim 15, wherein two or more probe oligonucleotide spots
in said pattern correspond to the same target nucleic acid.

1 8. The array according to Claim 14, wherein the length of each of said unique 
oligonucleotides ranges from about 65 to 90 nucleotides. 

5 

19. The array according to Claim 14, wherein the density of spots on said array does 
not exceed about 1000/cml 

20. The array according to Claim 14, wherein the density of spots on said array does 
1 0 not exceed about 400/cm^. 

21 . The array according to Claim 14, wherein the number of spots on said array ranges 
from about 50 to 50,000. 

22. The array according to Claim 14, wherein the number of spots on said array ranges 
from about 50 to 10,000. 

23. An array comprising a pattern of probe oligonucleotide spots of a density that does 
not exceed about 400 spots/cm² covalently attached to the surface of a glass support, 
wherein each probe oligonucleotide spot corresponds to a different target nucleic acid and 
comprises an oligonucleotide probe composition made up of long oligonucleotides of 
from about 65 to 90 nt in length, wherein each of said long oligonucleotides has 
substantially the same high hybridization efficiency for its corresponding target and 
substantially the same low level of non-specific hybridization. 

24. A method of preparing an array comprising at least one pattern of probe 
oligonucleotide spots stably associated with the surface of a solid support, wherein each 
probe oligonucleotide spot corresponds to a target nucleic acid and comprises an 
oligonucleotide probe composition made up of long oligonucleotide probes ranging in 
length from about 50 to 120 nt, said method comprising: 
generating said long oligonucleotide probes; and 

stably associating said long oligonucleotide probes on the surface of said solid 
support in a manner sufficient to produce said array. 

25. The method according to Claim 24, wherein said stably associating comprises 
covalently attaching said probes to said surface. 

26. The method according to Claim 25, wherein said covalently attaching comprises 
cross-linking. 

27. The method according to Claim 26, wherein said cross-linking is by exposure to 
UV light. 

28. The method according to Claim 24, wherein said stably associating comprises 
contacting said long oligonucleotide probes to said surface under denaturing conditions. 

29. The method according to Claim 24, wherein said surface is glass. 

30. A hybridization assay comprising the steps of: 

contacting at least one labeled target nucleic acid sample with an array according 
to Claim 1 under conditions sufficient to produce a hybridization pattern; and 
detecting said hybridization pattern. 

31. The method according to Claim 30, wherein said method further comprises 
washing said array prior to said detecting step. 

32. The method according to Claim 30, wherein said method further comprises 
preparing said labeled target nucleic acid sample. 

33. The method according to Claim 32, wherein said preparing comprises conjugating 
a detectable label to a functionalized target nucleic acid. 

34. The method according to Claim 30, wherein said method further comprises: 
generating a second hybridization pattern; and 

comparing said hybridization patterns. 

35. A kit for use in a hybridization assay, said kit comprising: 
an array according to Claim 1. 

ABSTRACT 

Long oligonucleotide arrays, as well as methods for their preparation and use in 
hybridization assays, are provided. The subject arrays are characterized in that at least a 
portion of the probes of the array, and usually all of the probes of the array, are long 
oligonucleotides, e.g. oligonucleotides having a length of from about 50 to 120 nt. Each 
long oligonucleotide probe on the array is preferably chosen to exhibit substantially the 
same high target binding efficiency and substantially the same low non-specific binding 
under conditions in which the array is employed. The subject arrays find use in a number 
of different applications, e.g. differential gene expression analysis. 